Using the Power of Robots.txt

Once we have a website up and running, we need to make sure that visiting search engines can access all the pages we want them to look at.

Sometimes, we might want search engines not to index certain parts of the site, or even to exclude particular search engines from the site altogether.

This is where a simple little two-line text file called robots.txt comes in.

Robots.txt lives in your website's root directory (on many Linux hosting setups this is the /public_html/ directory), and looks something like the following:

User-agent: *
Disallow:

The first line names the robot the rule applies to (* means all robots). The second line controls whether that robot is allowed in, or which parts of the site it is not allowed to visit. An empty Disallow: blocks nothing.
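To illustrate how the second line can shut robots out of just part of a site, here is a minimal sketch using Python's standard urllib.robotparser module. The /private/ directory, the page paths, and the robot name "anybot" are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every robot may visit everything except /private/.
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(rules)  # parse the rules directly, without fetching anything

# "anybot" is a made-up robot name; it matches the * record.
public_ok = parser.can_fetch("anybot", "/index.html")
private_ok = parser.can_fetch("anybot", "/private/notes.html")

print(public_ok, private_ok)  # True False
```

Keep in mind that robots.txt is only a request: well-behaved robots honor it, but nothing forces a robot to obey.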

If you want to control multiple spiders, simply repeat the lines above for each one.

So an example:

User-agent: googlebot
Disallow:

User-agent: askjeeves
Disallow: /

This allows Google (user-agent name Googlebot) to visit and index every page, while at the same time banning Ask Jeeves from the site completely.
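One way to check rules like these before uploading them is to run them through a parser. The sketch below uses Python's standard urllib.robotparser module with the two records from the example; the helper name is_allowed, the page path /products/page.html, and the robot name "someotherbot" are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two records from the example above, separated by a blank line.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow:

User-agent: askjeeves
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# /products/page.html is a hypothetical page used only for this check.
google_ok = is_allowed(ROBOTS_TXT, "googlebot", "/products/page.html")
jeeves_ok = is_allowed(ROBOTS_TXT, "askjeeves", "/products/page.html")
# A robot not named in the file has no matching record, so it is allowed.
other_ok = is_allowed(ROBOTS_TXT, "someotherbot", "/products/page.html")

print(google_ok, jeeves_ok, other_ok)  # True False True
```

The last check shows a point worth remembering: a robot that is not named in the file, and is not covered by a * record, is allowed everywhere by default.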

For a fairly current list of robot user-agent names, visit http://www.robotstxt.org/wc/active/html/index.html

Even if you want to allow every robot to index every page of your site, it's still advisable to put a robots.txt file on your site. It will stop your error logs from filling up with entries from search engines trying to access a robots.txt file that doesn't exist.

For more on robots.txt, see the full list of resources at http://www.websitesecrets101.com/robotstxt-further-reading-resources.